8 research outputs found

    Integrating Algorithmic Parameters into Benchmarking and Design Space Exploration in 3D Scene Understanding

    System designers typically use well-studied benchmarks to evaluate and improve new architectures and compilers. We design tomorrow's systems based on yesterday's applications. In this paper we investigate an emerging application, 3D scene understanding, likely to be significant in the mobile space in the near future. Until now, this application could only run in real-time on desktop GPUs. In this work, we examine how it can be mapped to power-constrained embedded systems. Key to our approach is the idea of incremental co-design exploration, where optimization choices that concern the domain layer are incrementally explored together with low-level compiler and architecture choices. The goal of this exploration is to reduce execution time while minimizing power and meeting our quality of result objective. As the design space is too large to exhaustively evaluate, we use active learning based on a random forest predictor to find good designs. We show that our approach can, for the first time, achieve dense 3D mapping and tracking in the real-time range within a 1W power budget on a popular embedded device. This is a 4.8x execution time improvement and a 2.8x power reduction compared to the state-of-the-art.

    Architecture support for intrusion detection systems

    System security is a prerequisite for efficient day-to-day transactions. As a consequence, Intrusion Detection Systems (IDS) are commonly used to provide an effective security ring to systems in a network. An IDS operates by inspecting packets flowing in the network for malicious content. To do so, an IDS like Snort[49] compares bytes in a packet with a database of previously reported attacks. This functionality can also be viewed as string matching of the packet bytes against the attack string database. Snort commonly uses the Aho-Corasick algorithm[2] to detect attacks in a packet. The Aho-Corasick algorithm works by first constructing a Finite State Machine (FSM) from the attack string database. The FSM is then traversed with the packet bytes. The main advantage of this algorithm is that it provides a linear-time search irrespective of the number of strings in the database. The issue, however, lies in devising a practical implementation. The FSM thus constructed becomes very large in terms of storage, and so is area inefficient. The growing memory footprint also hurts its performance. Another issue is the limited scope for exploiting parallelism, due to the inherently sequential nature of an FSM traversal. This thesis explores hardware and software techniques to accelerate attack detection using the Aho-Corasick algorithm. In the first part of this thesis, we investigate techniques to improve the area and performance efficiency of an IDS. Notable among our contributions is a pipelined architecture that accelerates accesses to the most frequently accessed node in the FSM. The second part of this thesis studies the resilience of an IDS to evasion attempts. In an evasion attempt, an adversary saturates the performance of an IDS to disable it, and thereby gains access to the network. We explore an evasion attempt that significantly degrades the performance of the Aho-Corasick algorithm used in an IDS. As a countermeasure, we propose a parallel architecture that improves the resilience of an IDS to an evasion attempt. The final part of this thesis explores techniques to exploit the characteristics of network traffic. In our study, we observe significant redundancy in the payload bytes, so we propose a mechanism to leverage this redundancy in the FSM traversal of the Aho-Corasick algorithm. We have also implemented our proposed redundancy-aware FSM traversal in Snort.
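The two phases described in this abstract (build an FSM from the attack strings, then traverse it with the packet bytes) can be illustrated with a minimal textbook sketch of Aho-Corasick. This is not Snort's implementation and not the thesis's architecture; the function names `build_automaton` and `search`, the list-of-dicts layout, and the sample patterns are assumptions chosen for illustration.

```python
from collections import deque

def build_automaton(patterns):
    """Phase 1: build goto, failure, and output tables from byte-string patterns."""
    goto = [{}]   # goto[state][byte] -> next state (trie edges)
    fail = [0]    # fail[state] -> state to fall back to on a mismatch
    out = [[]]    # out[state] -> patterns recognized on reaching this state
    for pat in patterns:
        state = 0
        for b in pat:
            if b not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][b] = len(goto) - 1
            state = goto[state][b]
        out[state].append(pat)
    # Breadth-first pass: a state's failure link depends only on shallower states.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for b, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and b not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(b, 0)
            out[t] = out[t] + out[fail[t]]
    return goto, fail, out

def search(packet, automaton):
    """Phase 2: traverse the FSM with the packet bytes, one byte at a time."""
    goto, fail, out = automaton
    state, matches = 0, []
    for i, b in enumerate(packet):
        while state and b not in goto[state]:
            state = fail[state]          # follow failure links on mismatch
        state = goto[state].get(b, 0)
        for pat in out[state]:
            matches.append((i - len(pat) + 1, pat))
    return matches

automaton = build_automaton([b"he", b"she", b"his", b"hers"])
print(search(b"ushers", automaton))
```

The sketch stores one dictionary per state, which keeps the code short; the storage growth the abstract refers to arises in table-based layouts that hold an entry for every possible byte in every state.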

    Hardware/software mechanisms for protecting an IDS against algorithmic complexity attacks

    No full text
    Intrusion Detection Systems (IDS) have emerged as one of the most promising ways to secure systems in the network. An IDS like the popular Snort[17] detects attacks on the network using a database of previous attacks. To detect these attack strings in a packet, Snort uses the Aho-Corasick algorithm. This algorithm first constructs a Finite State Machine (FSM) from the attack strings, and subsequently traverses the FSM using bytes from the packet. We observe that there are input bytes that result in a traversal of a series of FSM states (also viewed as pointers). This chain of pointer traversals increases the processing time of an input byte by 22X. Such a wide variance in the processing time of an input byte can be exploited by an adversary to throttle the IDS. If the IDS is unable to keep pace with the network traffic, the IDS gets disabled, and the network becomes vulnerable. Attacks mounted in this manner are referred to as algorithmic complexity attacks, and arise due to weaknesses in IDS processing. In this work, we explore defense mechanisms against the algorithmic complexity attack outlined above. Our proposed mechanisms provide over 3X improvement in the worst-case performance.

    Improving the performance efficiency of an IDS by exploiting temporal locality in network traffic

    No full text
    Network traffic has traditionally exhibited temporal locality in the header field of packets. Such locality is intuitive and is a consequence of the semantics of network protocols. In contrast, the locality in the packet payload has not been studied in significant detail. In this work we study temporal locality in the packet payload. Temporal locality can also be viewed as redundancy, and we observe significant redundancy in the packet payload. We investigate mechanisms to exploit it in a networking application. We choose Intrusion Detection Systems (IDS) as a case study. An IDS like the popular Snort operates by scanning the packet payload for known attack strings. It first builds a Finite State Machine (FSM) from a database of attack strings, and traverses this FSM using bytes from the packet payload. Temporal locality in network traffic therefore gives us an opportunity to accelerate this FSM traversal. Our mechanism dynamically identifies redundant bytes in the packet and skips their redundant FSM traversal. We further parallelize our mechanism by performing the redundancy identification concurrently with stages of Snort packet processing. IDS are commonly deployed on commodity processors, and we evaluate our mechanism on an Intel Core i3. Our performance study indicates that the length of the redundant chunk is a key factor in performance. We also observe important performance benefits in deploying our redundancy-aware mechanism in the Snort IDS[32].
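One way to picture skipping a redundant FSM traversal is to memoize the effect of a chunk of bytes on the FSM. The abstract does not say how redundant bytes are identified, so the following is only a sketch under assumptions: fixed-length chunks, a cache keyed by (start state, chunk), and a toy three-state FSM that recognizes the string "ab". The names `step`, `scan_with_cache`, and `chunk_len` are hypothetical.

```python
ACCEPTING = {2}  # toy FSM: state 2 means the string "ab" was just seen

def step(state, b):
    """Transition function of the toy FSM (assumed, for illustration only)."""
    if b == ord("a"):
        return 1
    if b == ord("b") and state == 1:
        return 2
    return 0

def scan_with_cache(payload, chunk_len=4):
    """Traverse the FSM chunk by chunk, skipping chunks already seen from the same state."""
    cache = {}            # (start_state, chunk) -> (end_state, match offsets within chunk)
    state, matches, skipped = 0, [], 0
    for base in range(0, len(payload), chunk_len):
        chunk = payload[base:base + chunk_len]
        key = (state, chunk)
        if key in cache:
            # Redundant chunk: reuse the recorded result instead of traversing.
            state, rel = cache[key]
            skipped += len(chunk)
        else:
            rel = []
            for off, b in enumerate(chunk):
                state = step(state, b)
                if state in ACCEPTING:
                    rel.append(off)
            cache[key] = (state, rel)
        matches.extend(base + off for off in rel)
    return matches, skipped

print(scan_with_cache(b"abxxabxxabxx"))
```

Keying the cache on the start state as well as the bytes is what keeps the skipped result identical to a full traversal. The chunk length trades lookup overhead against hit rate, which is in line with the abstract's remark that the length of the redundant chunk is a key factor in performance.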

    Improving the resilience of an IDS against performance throttling attacks

    No full text
    Intrusion Detection Systems (IDS) have emerged as one of the most promising ways to secure systems in the network. To be effective against evasion attempts, the IDS must provide tight bounds on performance. Otherwise an adversary can bypass the IDS by carefully crafting and sending packets that throttle it. This can render the IDS ineffective, leaving the network vulnerable. We present a performance throttling attack mounted against the computationally intensive string matching algorithm. This algorithm performs string matching by traversing a finite-state machine (FSM). We observe that there are some input bytes that sequentially traverse a chain of 30 pointers. This chain of traversal drastically degrades performance, and we observe a 22X performance drop in comparison to the average-case performance. We investigate hardware and software mechanisms to counter this performance degradation. The software mechanism is targeted at commodity general-purpose CPUs, while the hardware-based mechanism uses a parallel traversal suitable for network processor architectures. Our results show that our proposed mechanisms improve the string matching algorithm's worst-performing cases by over 3X.
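The per-byte variance described above can be reproduced on a small scale by counting how many failure links a textbook Aho-Corasick traversal follows for each input byte. The sketch below is illustrative only: the single pattern of repeated "a" bytes is a hypothetical worst case chosen to produce a long chain, not the attack input or rule set used in the paper, and `build` and `hops_per_byte` are assumed names.

```python
from collections import deque

def build(patterns):
    """Build the trie (goto) and failure links for byte-string patterns."""
    goto, fail = [{}], [0]
    for pat in patterns:
        s = 0
        for b in pat:
            if b not in goto[s]:
                goto.append({})
                fail.append(0)
                goto[s][b] = len(goto) - 1
            s = goto[s][b]
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for b, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and b not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(b, 0)
    return goto, fail

def hops_per_byte(data, goto, fail):
    """Return, for each input byte, the number of failure links followed."""
    state, hops = 0, []
    for b in data:
        n = 0
        while state and b not in goto[state]:
            state = fail[state]   # one pointer dereference per hop
            n += 1
        state = goto[state].get(b, 0)
        hops.append(n)
    return hops

chain = 30  # chosen to echo the chain length quoted above; any length works
goto, fail = build([b"a" * chain])
hops = hops_per_byte(b"a" * chain + b"b", goto, fail)
print(max(hops), hops[-1])
```

Here the first 30 bytes each advance the FSM with no fallback, while the final mismatching byte walks the whole failure chain back to the root. The total work stays linear in the input length, but a single byte can cost many dependent pointer dereferences, which is the kind of variance an adversary could try to trigger repeatedly.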
